
Module 15 of 15 · 📖 4 min read · ⏱ 30 min total


Table of contents (6 sections)
  1. Concepts and Background
  2. Architecture Diagram
  3. Practical Steps
  4. Common Pitfalls
  5. Further Resources
  6. Knowledge Check

FI-DPA 15 MLOps — Operating Models in Production

In this module, you will learn the concepts and practical implementation steps for reliably operating machine learning models in production. You will understand how to version, monitor, and automatically update models when needed, while also considering ethical aspects and model explainability.

Concepts and Background

Model Registry
A central storage location for all models in various versions, managing metadata and lifecycle.
Versioning
The systematic storage of models with unique identifiers to enable reproducibility and rollbacks.
A/B Testing
A procedure for comparing two models where different user groups receive different versions to objectively evaluate performance.
Drift Detection
The continuous monitoring of model outputs and input data for deviations from expected behavior that could lead to performance degradation.
Explainable AI (XAI)
Methods for explaining predictions from machine learning models to create transparency and trust in the results.
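
The A/B testing definition above depends on each user consistently receiving the same model version. A minimal sketch of one way to do that is hash-based assignment. The function name `assign_variant`, the salt value, and the 50/50 split are assumptions made for illustration and are not part of MLflow or any other tool named in this module.

```python
import hashlib


def assign_variant(user_id: str, treatment_share: float = 0.5,
                   salt: str = "churn-ab-test") -> str:
    """Deterministically map a user to model variant 'A' or 'B'.

    Hashing (salt + user_id) gives a stable, roughly uniform bucket in [0, 1),
    so the same user always sees the same model version.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # The first 8 hex digits form an integer in [0, 2**32); scale it to [0, 1).
    bucket = int(digest[:8], 16) / 2**32
    return "B" if bucket < treatment_share else "A"


# Usage: count how 10,000 hypothetical users are split between the variants.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}")] += 1
print(counts)
```

Changing the salt reshuffles the assignment, which can be useful when a new experiment should not inherit the groups of a previous one.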

Architecture Diagram

flowchart LR
    A[Data Source] --> B[Data Preprocessing]
    B --> C[Model Training]
    C --> D[MLflow Registry]
    D --> E[Model Deployment]
    E --> F[A/B Testing]
    F --> G[Monitoring]
    G --> H[Drift Detection]
    H --> I{Drift Detected?}
    I -->|Yes| J[Retraining]
    I -->|No| K[Production]
    J --> C
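
The "Drift Detected?" decision in the diagram can be sketched for a single numeric feature with the Population Stability Index (PSI). This is one possible drift measure, not necessarily the one a given monitoring tool uses. The function names, the ten equal-width bins, and the 0.2 threshold (a common rule of thumb) are assumptions for illustration.

```python
import math
from typing import List, Sequence


def population_stability_index(reference: Sequence[float],
                               current: Sequence[float],
                               bins: int = 10) -> float:
    """PSI between two samples, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range are clamped to the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def next_step(psi: float, threshold: float = 0.2) -> str:
    """Mirror the diagram: drift leads to retraining, otherwise stay in production."""
    return "retrain" if psi > threshold else "keep_in_production"


# Usage with synthetic data: the current sample is shifted by +5.
reference_sample = [i / 100 for i in range(1000)]
shifted_sample = [x + 5 for x in reference_sample]
print(next_step(population_stability_index(reference_sample, shifted_sample)))
```

In practice the threshold would be tuned per feature, and a check like this would run on a schedule against recent production inputs.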

Practical Steps

  1. Install and configure the MLflow server. It serves as the central platform for managing your models.
       pip install mlflow
       mlflow server --host 0.0.0.0 --port 5000
  2. Create an MLflow experiment for your project to group all runs and models.
       mlflow experiments create --experiment-name "Customer Classification"
  3. Train a model and register it in the MLflow Registry as part of logging it. Point the MLFLOW_TRACKING_URI environment variable at the server from step 1 so that the run is recorded there.
       import mlflow
       import mlflow.sklearn
       mlflow.set_experiment("Customer Classification")
       with mlflow.start_run():
           # model: a fitted scikit-learn estimator
           mlflow.sklearn.log_model(
               sk_model=model,
               artifact_path="model",
               registered_model_name="customer_churn_model",
           )
  4. Mark a version of the model for A/B testing and deploy it.
       from mlflow.tracking import MlflowClient

       client = MlflowClient()
       # Stages are deprecated since MLflow 2.9 in favor of model version aliases
       client.transition_model_version_stage(
           name="customer_churn_model",
           version=1,
           stage="Staging",
       )
  5. Implement a monitoring pipeline for data and concept drift.
       # Uses Evidently's older Report API; module paths differ in newer releases
       from evidently.report import Report
       from evidently.metric_preset import DataDriftPreset

       # reference_df: training-time data; new_data: recent production inputs
       report = Report(metrics=[DataDriftPreset()])
       report.run(reference_data=reference_df, current_data=new_data)
  6. Create a retraining workflow using Apache Airflow or a similar orchestrator.
  7. Integrate SHAP to explain individual model predictions.
       import shap

       # TreeExplainer supports tree-based models such as random forests and gradient boosting
       explainer = shap.TreeExplainer(model)
       shap_values = explainer.shap_values(X_test)
  8. Establish ethical guidelines for model evaluation and selection.
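
The retraining workflow from the steps above can be sketched independently of any scheduler. The example below is a hedged illustration, not an Airflow DAG: `run_retraining_cycle`, `RetrainingResult`, and the rule that a retrained model is only registered when it beats the current production score are all assumptions. In an orchestrator such as Airflow, each callable would typically become its own task.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RetrainingResult:
    retrained: bool
    promoted: bool
    candidate_score: Optional[float] = None


def run_retraining_cycle(
    drift_detected: Callable[[], bool],
    train: Callable[[], object],
    evaluate: Callable[[object], float],
    register: Callable[[object], None],
    production_score: float,
) -> RetrainingResult:
    """One pass of: check drift, retrain, evaluate, register if better."""
    if not drift_detected():
        # No drift: the current production model stays in place.
        return RetrainingResult(retrained=False, promoted=False)
    candidate = train()
    score = evaluate(candidate)
    if score > production_score:
        # Assumed promotion gate: only register a candidate that improves the metric.
        register(candidate)
        return RetrainingResult(True, True, score)
    return RetrainingResult(True, False, score)


# Usage with stand-in callables; a real setup would call the training code
# and the model registry here.
registry: List[object] = []
result = run_retraining_cycle(
    drift_detected=lambda: True,
    train=lambda: "candidate-model",
    evaluate=lambda model: 0.91,
    register=registry.append,
    production_score=0.88,
)
print(result)
```

Passing the steps in as callables keeps the control flow testable without a running tracking server or scheduler.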

Common Pitfalls

Further Resources

Knowledge Check

Four questions for self-assessment. Each question is followed by the correct answer and an explanation.

What is the primary purpose of a Model Registry in MLOps?
  • A) The automatic training of models
  • B) A central storage location for models in various versions with metadata management
  • C) The visualization of model outputs
  • D) The data collection for training

Correct Answer: B. A Model Registry serves as a central storage for models in various versions and manages their metadata and lifecycle. Option A describes training, not storage. Option C is the task of XAI tools and Option D relates to data preprocessing.

What is the main purpose of A/B Testing in the context of ML models?
  • A) The automatic updating of models
  • B) The continuous monitoring of model outputs
  • C) The objective comparison of two models by deploying them to different user groups
  • D) The explanation of model outputs

Correct Answer: C. A/B Testing compares two models by having different user groups receive different versions to objectively evaluate performance. Option A describes retraining, Option B is Drift Detection and Option D is the task of XAI.

What is Explainable AI (XAI) in the context of MLOps?
  • A) A procedure for automatic model optimization
  • B) Methods for explaining model outputs for transparency and trust
  • C) A system for version management of training data
  • D) A protocol for model deployment

Correct Answer: B. XAI encompasses methods for explaining predictions from machine learning models to create transparency and trust. Option A describes hyperparameter optimization, Option C is part of data management and Option D refers to deployment processes.

What happens in the MLOps lifecycle when drift is detected?
  • A) The model is automatically archived in the registry
  • B) The system automatically performs retraining
  • C) The model is flagged for human review
  • D) The system alerts the development team

Correct Answer: B. In the lifecycle shown in this module, detected drift triggers an automatic retraining run to counter the performance degradation. Option A is part of lifecycle management but not the immediate response to drift. Options C and D may also occur depending on the implementation, but retraining is the response this lifecycle defines for detected drift.